feat: LeanStore-style zero-allocation read and write paths#877

Open
JohannesLichtenberger wants to merge 21 commits into main from
feature/zero-alloc-write-singleton
Conversation

@JohannesLichtenberger
Member

Summary

  • Implement LeanStore-style slotted page architecture with flyweight node binding for zero-copy reads and zero-allocation writes
  • Remove legacy page format code (SlotOffsetCodec, slotMemory/slotOffsets, FixedSlotRecord*, CompactField*, NodeKindLayout*) and rename unifiedPage to slottedPage
  • Add FlyweightNode interface to all 12 JSON and 7 XML node types — dual-mode getters/setters (bound: direct MemorySegment access, unbound: Java primitives)
  • Implement write-path singleton cursor pattern: one factory-managed singleton per node type, rebound per prepareRecordForModification call — eliminates ~310M DataRecord allocations on Chicago dataset import
  • WriteSingletonBinder functional interface wires JsonNodeFactoryImpl.bindWriteSingleton to NodeStorageEngineWriter
  • Write singletons skip records[] storage — slotted page heap is the single source of truth
  • Refactor adaptForInsert/adaptForRemove and all hashing methods to pre-capture values into local primitives before singleton-rebinding prepareRecordForModification calls
  • New PageLayout, NodeFieldLayout, FlyweightNodeFactory infrastructure for slotted page format
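The dual-mode getter/setter pattern from the FlyweightNode bullet above can be sketched as follows. This is a minimal illustration, not SirixDB source: the class name, field names, and the fixed parentKey offset are all hypothetical stand-ins for the real per-kind layouts.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical flyweight: bound => reads/writes go straight to page memory,
// unbound => plain Java primitives. Offset 0 for parentKey is illustrative.
final class FlyweightSketch {
    private static final long PARENT_KEY_OFFSET = 0;

    private MemorySegment page;  // non-null means "bound"
    private long slotOffset;
    private long parentKeyLocal; // backing field for the unbound mode

    void bind(MemorySegment page, long slotOffset) {
        this.page = page;
        this.slotOffset = slotOffset;
    }

    void unbind() {
        // materialize the field back into a Java primitive before detaching
        parentKeyLocal = getParentKey();
        page = null;
    }

    long getParentKey() {
        return page != null
            ? page.get(ValueLayout.JAVA_LONG_UNALIGNED, slotOffset + PARENT_KEY_OFFSET)
            : parentKeyLocal;
    }

    void setParentKey(long key) {
        if (page != null) {
            page.set(ValueLayout.JAVA_LONG_UNALIGNED, slotOffset + PARENT_KEY_OFFSET, key);
        } else {
            parentKeyLocal = key;
        }
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment page = arena.allocate(64);
            FlyweightSketch node = new FlyweightSketch();
            node.bind(page, 16);
            node.setParentKey(42L);                  // written directly into page memory
            System.out.println(node.getParentKey()); // read back from the segment
            node.unbind();
            System.out.println(node.getParentKey()); // now served from the Java field
        }
    }
}
```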

Key design

Read path (zero-copy): AbstractNodeReadOnlyTrx holds one flyweight singleton per node type. moveTo() binds the singleton to the slotted page slot — all getters read directly from page MemorySegment via offset table. No deserialization, no allocation.

Write path (zero-allocation): JsonNodeFactoryImpl holds one write singleton per type. prepareRecordForModification rebinds via WriteSingletonBinder instead of allocating. In-place mutation via setDeltaFieldInPlace writes directly to page memory. Width changes trigger unbind → persistUpdatedRecord re-serializes.

Singleton safety: All callers extract needed values into local primitives before any prepareRecordForModification call that could rebind the singleton. Three mechanisms: primary references (factory create), pre-capture + local primitives (adapt/hashing), sequential prepare + immediate use (per-phase persist).
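The pre-capture rule can be made concrete with a toy model: a shared mutable array stands in for the per-type write singleton, and a rebind() stub stands in for prepareRecordForModification. All names here are hypothetical.

```java
// Illustrates why values must be copied into local primitives BEFORE any call
// that may rebind the shared singleton to a different record.
final class SingletonSafetySketch {
    // one shared mutable "singleton": [0] = nodeKey, [1] = rightSiblingKey
    static final long[] singleton = new long[2];

    // stands in for prepareRecordForModification rebinding the singleton
    static void rebind(long nodeKey, long rightSiblingKey) {
        singleton[0] = nodeKey;
        singleton[1] = rightSiblingKey;
    }

    public static void main(String[] args) {
        rebind(10, 11);                    // singleton bound to node 10
        // WRONG: reading singleton[1] after the next rebind would see node 11's data.
        // RIGHT: pre-capture while still bound to node 10:
        long rightSibKey = singleton[1];
        rebind(11, 12);                    // singleton now shows node 11
        System.out.println(rightSibKey);   // captured value survives the rebind
        System.out.println(singleton[1]);  // singleton now exposes the new node's data
    }
}
```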

Test plan

  • All sirix-core tests pass (1985 tests, 0 failures)
  • All sirix-query tests pass
  • Chicago dataset benchmark: verify throughput >= 4.3M nodes/sec
  • Profile with PROFILE_EVENT=alloc: verify DataRecord allocations near zero on write path

Johannes Lichtenberger added 21 commits February 22, 2026 22:59
…e format

Extend the LeanStore-style unified page architecture to XML nodes, enabling
zero-copy flyweight reads for all 7 XML node types (Element, Attribute, Text,
Comment, PI, Namespace, XmlDocumentRoot) — matching the JSON node implementation.

Key changes:
- All XML node types implement FlyweightNode with bind/unbind/serializeToHeap
- Add XML node field layouts to NodeFieldLayout.java
- Register XML nodes in FlyweightNodeFactory.java
- Add XML singletons to AbstractNodeReadOnlyTrx for zero-allocation reads
- Fix flyweight getHash() for value nodes (TEXT/COMMENT/ATTRIBUTE/NAMESPACE):
  when bound, read stored hash from MemorySegment to preserve rolling hash
  arithmetic correctness across sibling key changes
- Delete dead legacy serialization code from PageKind.java
- Delete unused node/layout package (CompactFieldCodec, CompactPageEncoder, etc.)
- Update optimized diff test expectations for improved hash-based optimization
…Page

Remove all dead legacy format code (slotMemory/slotOffsets/deweyIdMemory/
deweyIdOffsets) from KeyValueLeafPage now that every node type uses
FlyweightNode and the slotted page format. Delete SlotOffsetCodec and its
tests. Rename unifiedPage to slottedPage using standard PostgreSQL/database
terminology (Header + Bitmap + Directory + Heap = slotted page).
Eliminate per-write DataRecord allocations by reusing factory-managed
singleton flyweight nodes on the write path. prepareRecordForModification
now rebinds a type-specific singleton to the slotted page slot instead
of allocating a fresh object, and write singletons skip records[] storage
since their data lives directly on the page heap.

- Add writeSingleton flag to FlyweightNode interface and all 12 JSON types
- Add bindWriteSingleton dispatch to JsonNodeFactoryImpl (one singleton per type)
- WriteSingletonBinder functional interface on StorageEngineWriter
- prepareRecordForModification uses binder when records[offset] is null
- setRecord skips records[] for write singletons (page heap is canonical)
- Refactor adaptForInsert/adaptForRemove to pre-capture all values into
  local primitives before prepareRecordForModification calls
- Fix name index resolution for flyweight-bound ObjectKeyNodes (cachedName
  is a Java-only field, resolve from nameKey via storage engine)
… hot path

Replace ObjectKeyNodeCreator/SiblingNodeCreator lambda interfaces with
PrimitiveNodeType enum + switch dispatch to avoid per-insert lambda
allocation. Add getStructuralNodeView() to return singleton directly
without snapshot copy, capturing needed values into local primitives
before any cursor movement. Fix getPathNodeKey() to restore cursor via
moveTo(nodeKey) instead of retaining stale singleton reference. Use
raw byte[] instead of UTF-8 string decode in hash computation for
StringNode/ObjectStringNode.
Add moveToSingletonWrite() that resolves pages from the writer's
Transaction Intent List, enabling zero-allocation node navigation
during write transactions. Cached writer/reader references avoid
per-call instanceof checks. Refactor moveToSingletonSlowPath
into shared moveToSingletonFromPage for code reuse.
…loss

setDeweyID(SirixDeweyID) sets sirixDeweyID but clears deweyIDBytes=null
(lazy serialization). toSnapshot() read deweyIDBytes directly, missing
the sirixDeweyID value. This caused null DeweyIDs in snapshots created
by moveToSingletonWrite, which then propagated through persistUpdatedRecord
to commit-time serialization.

Replace deweyIDBytes with getDeweyIDAsBytes() in all 17 node types'
toSnapshot() methods (both bound and unbound paths, 34 occurrences).
- Add clearBinding() to FlyweightNode interface (19 implementations):
  nulls page reference without materializing fields back to Java
  primitives, saving O(N) field reads when all fields will be
  overwritten by factory setters

- Remove isBound()/unbind() guard before bind() in bindWriteSingleton:
  bind() overwrites all fields unconditionally, the guard was pure
  overhead (2 megamorphic itable calls per bind, was 1.11% CPU)

- Replace getSingletonForKind 19-case switch with array-based O(1)
  lookup via singletonByKindId[] (was 0.83% CPU)

- Replace unbind() with clearBinding() in all 11 factory bind methods
  and setNewRecord: avoids expensive field materialization when all
  fields are about to be overwritten

- Inline pageKey computation (nodeKey >> NDP_NODE_COUNT_EXPONENT) and
  recordPageOffset (nodeKey & 0x3FF) in moveToSingleton/Write paths:
  eliminates virtual assertNotClosed() + switch on every moveTo

- Cache getSlottedPage() in local variable in moveToSingletonWrite

Profile: 165s baseline -> 155s after optimizations (G1GC, Chicago)
Key wins: bindWriteSingleton itable 1.11%->0.39%,
          getSingletonForKind 0.83%->0.45%,
          StringNode/ArrayNode.bind inlined by JIT
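The inlined page-key arithmetic above is simple shift-and-mask math. The 0x3FF mask implies 1024 records per page, which matches an exponent of 10; that value is assumed here rather than taken from the source.

```java
// Sketch of the inlined computation: page key is the node key divided by
// records-per-page (a right shift), the in-page offset is the remainder (a mask).
public final class PageKeySketch {
    // assumed: 2^10 = 1024 records per page, implied by the 0x3FF mask
    static final int NDP_NODE_COUNT_EXPONENT = 10;

    static long pageKey(long nodeKey) {
        return nodeKey >> NDP_NODE_COUNT_EXPONENT;
    }

    static int recordPageOffset(long nodeKey) {
        return (int) (nodeKey & 0x3FF);
    }

    public static void main(String[] args) {
        System.out.println(pageKey(5000));          // 5000 / 1024
        System.out.println(recordPageOffset(5000)); // 5000 % 1024
    }
}
```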

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)
The insert path previously did:
  moveTo(nodeKey) → adaptForInsert(getStructuralNodeView()) → moveTo(nodeKey) → hash

Now does:
  adaptForInsert(nodeKey, parentKey, leftSibKey, rightSibKey) → hash(nodeKey) → moveTo(nodeKey)

Net: eliminated 1 moveTo per node insert (was 6.58% of CPU).

Changes:
- adaptForInsert now accepts 6 structural keys directly instead of
  reading from cursor — the caller already knows these from the factory
- adaptHashesWithAdd/rollingAdd accept startNodeKey parameter, avoiding
  a redundant moveTo to position cursor just to read the node key
- All 13 call sites updated to pass keys directly
- insertPrimitiveAsChild: hoisted leftSibKey/rightSibKey out of
  if/else blocks for scope visibility
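The saved moveTo can be counted with a stub cursor. This is a schematic of the call-shape change only; the real adaptForInsert and hashing methods are stubbed out as comments.

```java
// Old shape re-positioned the cursor so adaptForInsert/hash could read keys
// from it; new shape passes the keys the caller already holds.
public final class AdaptForInsertSketch {
    static int moveToCalls = 0;

    static void moveTo(long key) { moveToCalls++; }

    static void oldInsert(long nodeKey) {
        moveTo(nodeKey); // position for adaptForInsert(getStructuralNodeView())
        moveTo(nodeKey); // re-position before hashing
    }

    static void newInsert(long nodeKey, long parentKey, long leftSibKey, long rightSibKey) {
        // adaptForInsert(nodeKey, parentKey, leftSibKey, rightSibKey): no cursor read
        // adaptHashesWithAdd(startNodeKey = nodeKey): no redundant moveTo
        moveTo(nodeKey); // single positioning call
    }

    public static void main(String[] args) {
        oldInsert(7);
        int oldCalls = moveToCalls;
        moveToCalls = 0;
        newInsert(7, 3, -1, -1);
        System.out.println(oldCalls - moveToCalls); // moveTo calls saved per insert
    }
}
```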

Skip serializeToHeap() in setNewRecord() and setRecord() on the hot
write path. Instead, store toSnapshot() copies in records[] and defer
all serialization to processEntries() at commit time.

This eliminates ~16% CPU overhead from MemorySegment API bounds checks,
session validation, and DeltaVarIntCodec encoding/decoding that was
happening 2-3x per node insert on the hot path.

Chicago benchmark: 143s → 79s (44.8% improvement, ~3.9M nodes/sec).

…ite path

Replace singleton reuse + toSnapshot() with direct constructor calls in
all 11 factory bind methods. This eliminates value.clone() overhead for
StringNode/ObjectStringNode and the toSnapshot object copy for all types.

setNewRecord now checks isWriteSingleton: non-singletons are stored
directly in records[] with zero copy. Singletons (used by the
writeSingletonBinder for post-commit reads) are still snapshotted.

Chicago benchmark: 79s → 78s (marginal, but cleaner architecture).

…avings)

ObjectNode was the only node type using a fixed 73-byte layout with raw
JAVA_LONG_UNALIGNED reads. All other 18 node types already used the
LeanStore-style delta/varint encoding with per-record offset tables.

Revert ObjectNode to the same format: [nodeKind:1][offsetTable:10x1byte]
[varint data fields + 8-byte hash]. Records shrink from 73 to ~35 bytes,
saving ~12 GB heap for the 310M-node Chicago dataset.
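The size win comes from variable-width integer encoding. DeltaVarIntCodec's exact format is not shown in this PR; the generic LEB128-style varint below illustrates why small values shrink from a fixed 8 bytes to 1-2 bytes.

```java
import java.io.ByteArrayOutputStream;

// Generic 7-bits-per-byte varint (continuation bit in the high bit).
// Illustrative only — not the actual DeltaVarIntCodec wire format.
public final class VarIntSketch {
    static byte[] encode(long v) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        do {
            byte b = (byte) (v & 0x7F);
            v >>>= 7;
            if (v != 0) b |= (byte) 0x80; // more bytes follow
            out.write(b);
        } while (v != 0);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encode(5).length);        // vs 8 bytes for a raw JAVA_LONG
        System.out.println(encode(300).length);
        System.out.println(encode(1L << 40).length);
    }
}
```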

Also fix stale test that expected eager heap serialization in setRecord
for non-singleton unbound FlyweightNodes — setRecord now correctly defers
serialization to processEntries at commit time.
createRecord serializes FlyweightNode directly to slotted page heap via
serializeNewRecord, bypassing records[] entirely. persistRecord no-ops
for bound write singletons since mutations already live on the heap.

prepareRecordForModification gains a zero-copy fast path: raw slot bytes
are copied from the complete page and the singleton is rebound, avoiding
deserialize-serialize round-trips.

slottedPage is reinterpreted to Long.MAX_VALUE to eliminate JIT bounds
checks on MemorySegment get/set; actual capacity tracked separately in
slottedPageCapacity. All 27 JSON+XML node types gain ownerPage lifecycle
and resize-in-place support for varint width changes.
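The reinterpret trick above can be shown in isolation: widening the segment's reported size to Long.MAX_VALUE removes the per-access bounds checks, while the caller tracks the real capacity itself. Note that MemorySegment.reinterpret is a restricted method and may print a warning on recent JDKs.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Minimal sketch: a 4 KiB page reinterpreted as "unbounded"; the caller is
// now responsible for enforcing offset < capacity on its own.
public final class ReinterpretSketch {
    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment page = arena.allocate(4096);
            long capacity = page.byteSize(); // tracked separately, as in slottedPageCapacity
            MemorySegment unbounded = page.reinterpret(Long.MAX_VALUE);

            unbounded.set(ValueLayout.JAVA_LONG_UNALIGNED, 128, 7L);
            System.out.println(unbounded.get(ValueLayout.JAVA_LONG_UNALIGNED, 128));
            System.out.println(capacity);
        }
    }
}
```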
adaptPathForChangedNode calls resetPathNodeKey which internally calls
prepareRecordForModification for other DOCUMENT nodes, rebinding the
same factory write singleton to a different record. The subsequent
setPathNodeKey then corrupted the wrong node, producing a different
path summary tree structure that led to ClassCastException (DeletedNode
cannot be cast to StructNode) in PathSummaryWriter.removePathSummaryNode.

Fix: save the node key before path summary adaptation and re-acquire
the singleton afterwards via prepareRecordForModification.
…bug logging

ResourceStoreImpl.close() removed sessions from its local map but never
called allResourceSessions.removeObject(), leaving stale entries in the
global PathBasedPool. This caused cascading test failures when
deleteEverything() tried to remove databases with phantom open sessions.

Also removed all appendAgentDebugLog instrumentation from hot paths
(JsonNodeTrxImpl, NodeStorageEngineWriter, Databases, FileChannelReader,
FileChannelWriter) — file I/O on every node insertion was devastating
for shredding throughput.
Guard notifyPrimitiveIndexChange call sites with
indexController.hasAnyPrimitiveIndex() to eliminate redundant moveTo
calls, pathNodeKey resolution, and getCurrentNode() snapshot allocations
when no user-defined indexes are present (the common case for bulk
imports like Chicago 3.6GB). Also replace getCurrentNode() with
getStructuralNodeView() at guarded sites for zero-alloc when indexes
do exist.
…width change

Replace the unbind() + resizeRecord() cold path in ~100 setter methods
across all 19 node types (12 JSON + 7 XML) with resizeRecordField(),
which performs a targeted single-field raw-copy resize directly on the
slotted page heap using three bulk MemorySegment.copy() calls.

Phase 0: Add DeltaVarIntCodec.resizeField() + FieldEncoder interface
for raw-copy resize of individual offset-table fields. Add
KeyValueLeafPage.resizeRecordField() which allocates new heap space,
delegates to resizeField(), copies DeweyID trailer, updates directory,
and re-binds the flyweight.

Phase 2: Extract each setter's cold path (varint width mismatch) into
a private resize method that calls resizeRecordField(). Hot path
(in-place write when width matches) is unchanged. setDeweyID and
setValue/setRawValue retain the old unbind pattern (trailer/payload
changes, not offset-table fields).

Key benefits:
- Zero allocations on resize cold path (no unbind materialization)
- Only the changed field is re-encoded; unchanged fields are bulk-copied
- MemorySegment.copy() uses AVX/MOVSQ platform intrinsics
- Cold path extracted to private methods for JIT optimization
- Flyweight stays bound throughout (no unbind/re-bind state machine)

2000 tests pass, 0 failures.
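The three-bulk-copy resize described above amounts to: copy the bytes before the field, write the re-encoded field, copy the bytes after it. The sketch below shows that shape on a bare MemorySegment; method and parameter names are illustrative, not the actual resizeRecordField signature.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Single-field raw-copy resize: prefix and suffix are bulk-copied unchanged,
// only the changed field is re-encoded (here pre-encoded as newField bytes).
public final class ResizeFieldSketch {
    static MemorySegment resizeField(Arena arena, MemorySegment record,
                                     long fieldStart, long oldLen, byte[] newField) {
        long tail = record.byteSize() - (fieldStart + oldLen);
        MemorySegment out = arena.allocate(fieldStart + newField.length + tail);
        MemorySegment.copy(record, 0, out, 0, fieldStart);                  // prefix
        MemorySegment.copy(MemorySegment.ofArray(newField), 0,
                           out, fieldStart, newField.length);               // changed field
        MemorySegment.copy(record, fieldStart + oldLen,
                           out, fieldStart + newField.length, tail);        // suffix
        return out;
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment rec = arena.allocate(6);
            for (int i = 0; i < 6; i++) rec.set(ValueLayout.JAVA_BYTE, i, (byte) i);
            // grow the 2-byte field at offset 2 into a 3-byte field
            MemorySegment resized = resizeField(arena, rec, 2, 2, new byte[]{9, 9, 9});
            StringBuilder sb = new StringBuilder();
            for (long i = 0; i < resized.byteSize(); i++)
                sb.append(resized.get(ValueLayout.JAVA_BYTE, i));
            System.out.println(sb);
        }
    }
}
```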
Add static writeNewRecord() to all 7 XML node types (Element, Attribute,
Text, Comment, PI, Namespace, XmlDocumentRoot) and refactor XmlNodeFactoryImpl
to use the allocateForDocumentCreation → writeNewRecord → completeDirectWrite
pipeline, matching the JSON pattern. This eliminates ~38 redundant ops per
node creation (13 branch checks, 13 field writes, 10 field reads, 2
clearBinding calls) by encoding directly from parameters to MemorySegment.

Also adds initForCreation() to JSON nodes and refactors JsonNodeFactoryImpl
bind methods to use it, plus adds allocateForDocumentCreation infrastructure
to StorageEngineWriter/NodeStorageEngineWriter.

Fix isCompressed() in PINode, CommentNode, StringNode, and ObjectStringNode
to lazy-read from MemorySegment when bound, ensuring the slotted page is the
single source of truth for all serialized data. Removes redundant
setCompressed() calls from factories.
…ove dead code

Eliminate byte[] allocation on every computeHash() call across all 15+
node types by routing through hashDirect() which hashes directly from
the backing MemorySegment — no legacy ByteBuffer, no toByteArray() copy.

- Add PooledBytesOut.hashDirect(): heap segments use hashBytes(array),
  native segments use hashMemory(address, len)
- Fix MemorySegmentBytesOut.hashDirect() off-heap path: hashMemory()
  instead of legacy asByteBuffer()
- Migrate 11 flyweight nodes from hashBytes(toByteArray()) to hashDirect()
- Simplify 4 nodes (TextNode, CommentNode, JsonDocumentRootNode,
  XmlDocumentRootNode) from inline ByteBuffer cast to hashDirect()
- Remove 17 dead bind methods from JSON/XML factories
- Remove 12 dead initForCreation() from all JSON node types
- Remove 2 dead nextNodeKey() from both factories
- Remove unused imports from JsonNodeFactoryImpl
Remove all java.nio.ByteBuffer usage from the node, BytesOut, and
BytesIn layers. ByteBuffer is now confined to only the I/O boundary
(FileChannelWriter/Reader) where Java's FileChannel API requires it.

- Remove BytesOut.write(long, ByteBuffer, ...) dead API method and
  both implementations (MemorySegmentBytesOut, PooledBytesOut)
- Change MemorySegmentBytesOut.underlyingObject() to return
  MemorySegment directly instead of asByteBuffer() wrapper
- Update FileChannelWriter to call asByteBuffer() at I/O boundary
- Remove Bytes.wrapForRead(ByteBuffer) overload, fix FileReader
  caller to use byte[] overload directly
- Delete dead BytesUtils class (zero callers)
- Remove unused ByteBuffer imports from 24 node files
- Remove unused ByteBuffer import from Writer interface
… path

Devirtualize bindWriteSingleton() in both Json/XmlNodeFactoryImpl via
concrete-type switch dispatch (12 JSON + 7 XML cases), eliminating 3
itable stubs per bind. Change completeDirectWrite() to accept nodeKindId
instead of FlyweightNode, moving bind()/setOwnerPage() to callers where
concrete types are known (monomorphic dispatch).

Add setDeweyIDBytes(byte[]) to Node interface and all 19 flyweight node
types — stores raw bytes without parsing SirixDeweyID constructor. Full
parse deferred to first getDeweyID() call. Eliminates ~2.7% CPU from
DeweyID reconstruction on every moveTo and bind.

Profile result: 131s → 106s (19% faster), itable stubs 6.1% → 2.1%,
DeweyID parsing eliminated from hot path entirely.
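The lazy-parse pattern behind setDeweyIDBytes can be sketched with a counter standing in for the expensive SirixDeweyID constructor; all names below are illustrative.

```java
// Store raw bytes on every bind (cheap), parse only on first getDeweyID().
public final class LazyDeweyIdSketch {
    static int parses = 0; // counts how often the "expensive constructor" runs

    private byte[] deweyIdBytes;
    private Object parsed; // stands in for a SirixDeweyID instance

    void setDeweyIDBytes(byte[] raw) { // called on every bind — no parsing
        deweyIdBytes = raw;
        parsed = null;
    }

    Object getDeweyID() { // full parse deferred to first access, then cached
        if (parsed == null && deweyIdBytes != null) {
            parses++; // expensive parse would happen here
            parsed = new String(deweyIdBytes);
        }
        return parsed;
    }

    public static void main(String[] args) {
        LazyDeweyIdSketch node = new LazyDeweyIdSketch();
        for (int i = 0; i < 1000; i++) node.setDeweyIDBytes(new byte[]{1, 2});
        System.out.println(parses); // 1000 binds, zero parses
        node.getDeweyID();
        node.getDeweyID();
        System.out.println(parses); // parsed once, then served from cache
    }
}
```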